METHOD, APPARATUS, AND SYSTEM FOR SYNTHESIZING AN AUDIO
PERFORMANCE USING CONVOLUTION AT MULTIPLE SAMPLE RATES

CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of US provisional patent application nos. 60/510,068 and
60/510,019, both filed on October 9, 2003.



This application includes a Computer Listing Appendix on compact disc, hereby incorporated by 
reference. 

1. Field of the Invention

[0001] The present invention relates generally to audio processing and, more particularly, to a 
method, apparatus, and system for synthesizing an audio performance in which one or more 
acoustic characteristics, such as acoustic space, microphone modeling and placement, are varied 
using pseudo-convolution processing techniques. 
2. Description of the Prior Art

[0002] Digital music synthesizers are known in the art. An example of such a digital music
synthesizer is disclosed in U.S. Patent No. 5,502,747, hereby incorporated by reference. The
system disclosed in the '747 patent discloses multiple component filters and is based on hybrid
time domain and frequency domain processing. Unfortunately, the methodology utilized in the
'747 patent is relatively computationally intensive and is thus not efficient. As such, the
system disclosed in the '747 patent is primarily only useful in academic and scientific
applications where computation time is not critical. Thus, there is a need for a synthesizer
that is relatively more efficient than those in the prior art.

SUMMARY OF THE INVENTION
[0003] The present invention relates to a method, apparatus, and system for use in synthesizing
an audio performance in which one or more acoustic characteristics, such as acoustic space,
microphone modeling and placement, can selectively be varied. In order to reduce processing
time, the system utilizes pseudo-convolution processing techniques at a greatly reduced
processor load. The system is able to emulate the audio output in different acoustic spaces;
separate musical sources (instruments and other sound sources) from musical context;
interactively recombine musical source and musical context with relatively accurate acoustical
integrity, including surround sound contexts; emulate microphone models and microphone
placement; create acoustic effects, such as reverberation; emulate instrument body resonance; and
interactively switch emulated instrument bodies on a given musical instrument.
invention.



[0005] FIGS. 1C and 1D are alternate exemplary graphical user interfaces for use with the
present invention.



[0007] FIG. 3 is a block diagram of an exemplary embodiment of a run-time input channel 
processing routine designated by the block 50 in Fig. 2 in accordance with the present invention; 

[0008] FIG. 4 is a more detailed block diagram of the embodiment illustrated in FIG. 2;

[0009] FIG. 5 is a block diagram illustrating a process channel routine designated by the block
53 in Fig. 2 in accordance with the present invention;

[0010] FIG. 6 is a time domain response of an exemplary sound impulse;

[0011] FIG. 7 is a block diagram of an audio collection and index sequencing routine illustrated
by the block 178 in Fig. 5 in accordance with the present invention, represented by the blocks
178a, 178b and 178c, which illustrate different operational modes for the Audio Collection and
Index Servicing Routine in accordance with the present invention.

[0012] FIG. 8 is a block diagram of a coefficient index sequencing routine illustrated by the block
170 in Fig. 5 in accordance with the present invention; and

[0013] FIG. 9 is a block diagram of the collection index modulo update routine illustrated by the
block 192 in Fig. 7 in accordance with the present invention.

[0014] FIG. 10 is a block diagram of the frame modulo update in accordance with the present
invention;

[0015] FIG. 11 is an exemplary block diagram of the tail extension processing in accordance
with the present invention;

[0016] FIG. 12 is a hardware block diagram of a computing platform for use with the present
invention.



DETAILED DESCRIPTION

[0017] The present invention relates to an audio processing system for synthesizing an acoustic
response in which one or more acoustic characteristics are selectably varied. For example, the
audio response in a selectable musical context or acoustical space can be emulated. In particular,
a model of virtually any acoustic space, for example, Carnegie Hall, can be recorded and stored.
In accordance with one aspect of the invention, the system emulates the acoustic response in the
selected acoustic space model, such that the audio input sounds as if it were played in Carnegie
Hall, for example.

[0018] In accordance with one aspect of the invention, the system has the ability to separate
musical sources (i.e. instruments and other sound sources) from the musical context (i.e. acoustic
space in which the sound sources are played). By emulating the response to selectable music
contexts, as described above, the acoustic response to various musical sources can be emulated
for virtually any acoustic space, including the back seat of a station wagon.




[0019] Various techniques can be used for generating a model of an acoustic space. The model
may be considered a fingerprint of a room or other space or musical context. The model is
created, for example, by recording the room response to a sound impulse, such as a shot from a
starter pistol or other acoustic input. The sound impulse may be created, for example, by placing
a speaker in the room or space to be modeled and playing a frequency sweep. More particularly,
a common technique is the sine sweep method which has a sweep tone and a complementary
decode tone. The convolution of the sweep tone and the decode tone is a perfect single sample
spike (impulse). After the sweep tone is played through the speaker and recorded by a
microphone in the room, the resulting recording is convolved with the decode tone which reveals
the room impulse response. Alternatively, simply firing a starter pistol in the space and recording
the response is another way. Alternatively, various "canned" acoustic space models are currently
available on the Internet at http://www.echochamber.ch; http://altiverb.daw-mac.com; and
http://noisevault.com.
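The sweep and decode relationship described above can be illustrated with a minimal sketch. It assumes a periodic (circular) measurement model and a synthetic sweep with a flat magnitude spectrum and quadratic phase; the function names, the frame length of 1024 samples, and the toy room response are hypothetical and are not taken from the disclosure.

```python
import numpy as np

def make_sweep_and_decode(n):
    """Return a periodic sweep and its complementary decode tone.

    Assumes n is a multiple of 4 so the Nyquist bin is real.
    """
    k = np.arange(n // 2 + 1)
    # Unit magnitude in every bin; quadratic phase delays bin k by k samples,
    # which makes the time signal a rising sweep.
    spectrum = np.exp(-1j * np.pi * k**2 / n)
    sweep = np.fft.irfft(spectrum, n)
    # Conjugating the spectrum cancels the phase of the sweep.
    decode = np.fft.irfft(np.conj(spectrum), n)
    return sweep, decode

def circular_convolve(a, b, n):
    """Circular convolution of two real signals over n samples."""
    return np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

def measure_impulse_response(recording, decode):
    """Convolve the recorded sweep with the decode tone to reveal the response."""
    return circular_convolve(recording, decode, len(decode))
```

In this sketch, convolving the sweep with the decode tone yields a single-sample spike, and convolving a simulated room recording with the decode tone returns the simulated room response. A real measurement would use linear rather than circular convolution and would contain noise.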



[0020] In accordance with other aspects of the invention, the system is able to emulate other
acoustic characteristics, such as the response to one or more predetermined microphones, such
as a vintage AKG C-12 microphone. The microphone is emulated in the same manner as the
musical context. In particular, the acoustic response to an acoustic impulse of the vintage
microphone, for example, is recorded and stored. Any musical source played through the system
is processed so that it sounds as if it were played through the vintage microphone.

[0021] The system is also able to emulate other acoustic characteristics, such as the location of
an audio source within an audio context. In particular, in accordance with another aspect of the
invention, the system is able to combine a sound source, the response of an acoustic space, a
microphone and an instrument body resonance response into separate, reconfigurable audio
sources in an audio performance. For example, when an instrument, say a violin, is performed
in a room and recorded through a microphone, the resulting audio contains tonality and
reverberation dictated by multiple impulse elements, namely the microphone, room acoustics and
the violin body. In many cases it is desirable to control these three elements individually and
separate from each other and the string vibration of the violin. By doing so, different choices of
microphone, room environment or violin body can be independently selected by the user or
content author for an audio performance. In addition, the system is able to optionally emulate
the response to another audio characteristic, such as the location of an audio source relative to
the microphone placement, thus allowing the audio source to be virtually moved relative to the
microphone. As such, drums, for example, can be made to sound closer or further apart from the
microphones.
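The idea of treating the instrument body, room, and microphone as separate, swappable impulse elements can be sketched as follows. This is an illustrative sketch only: it assumes each element is modeled as a linear impulse response applied by ordinary convolution, and the function names `render` and `combine` are hypothetical.

```python
import numpy as np

def render(source, *impulse_responses):
    """Apply each impulse response in turn to the dry source signal."""
    out = np.asarray(source, dtype=float)
    for ir in impulse_responses:
        out = np.convolve(out, ir)
    return out

def combine(*impulse_responses):
    """Pre-combine several impulse responses into one equivalent response."""
    combined = np.array([1.0])  # unit impulse: leaves a signal unchanged
    for ir in impulse_responses:
        combined = np.convolve(combined, ir)
    return combined
```

Because convolution is associative, `render(string, body, room, mic)` matches `render(string, combine(body, room, mic))`, so any one element (for example the microphone) can be replaced without touching the others.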


[0022] In accordance with another aspect of the invention, the system is a real time audio
processing system that is significantly less computation intensive than known music
synthesizers, such as the audio processing system disclosed in the '747 patent discussed above.
In particular, various techniques are used to reduce the processing load relative to known
systems. For example, as will be described in more detail below, in a "Turbo" mode of
operation, the system processes input audio samples at a slower sample rate than the input
sample rate, thus reducing the processor load up to 75%, for example.

An exemplary host computing platform for use with the present invention is illustrated in FIG.
12 and generally identified with the reference numeral 20. The host computing platform when
loaded with the user interface and processing algorithm described below forms an audio
synthesizer. The host computing platform 20 includes a CPU 22, a random access memory
(RAM) 24, a hard drive 26, as well as an external display 28, an external microphone 30 and one
or more external speakers 32. Minimum requirements for the host computing platform 20 are:
Windows XP (Pro, Home edition, embedded or other compatible operating system), an Intel
Pentium 4, Celeron, Athlon XP 1 GHz or other CPU, 256 MB RAM, and a 20 GB hard drive.

USER INTERFACES 

[0023] FIGS. 1A-1D illustrate graphical representations of exemplary embodiments of a
control panel 100 which may be used in connection with the present invention. Only one
embodiment is described for simplicity. In particular, in the embodiment illustrated in FIG. 1A,
the control panel 100 includes a drop-down menu 102 which may be used to select a
predetermined musical context (e.g., dark, hardwood floors, medium...), a drop-down menu 104
which may be used to select a "raw impulse", a drop-down menu 106 which may be used to
select a particular musical instrument (e.g., V1 violins, Legato down bows), a drop-down menu
108 which may be used to select an original microphone (e.g., NT1000), and a drop-down menu
110 which may be used to select a particular replacement microphone (e.g., AKG414). A
display area 112 is provided for displaying a brief textual description of a microphone placement
selection, as described in more detail below.

[0024] A button 114 is provided for selectively enabling and disabling a "cascade" feature
associated with application of the raw impulse selected via the drop-down menu 104 to an audio
track. A button 116 is provided for selectively enabling and disabling an "encode" feature which
permits the application of a user-selected acoustic model to the instrument selected via the
drop-down menu 106. A display area 118 optionally may show a graphical or photographic
representation of the musical context selected by the drop-down menu 102.

[0025] A button 120 is provided for selectively activating and deactivating a mid/side (M/S)
microphone pair arrangement for left-side and right-side microphones. Additional buttons 121,
122, 123, and 124 are provided for specifying groups of microphones, including, for example, all
microphones (button 121), front ("F") microphones (button 122), wide ("W") microphones
(button 123), and rear or surround ("S") microphones (button 124).
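For readers unfamiliar with mid/side microphone pairs, the conventional sum-and-difference decode is sketched below. The source does not give the decode formula or scaling used by the system, so the unity-gain sum and difference and the function names here are assumptions for illustration.

```python
import numpy as np

def ms_decode(mid, side):
    """Conventional M/S decode: left is the sum, right is the difference."""
    mid = np.asarray(mid, dtype=float)
    side = np.asarray(side, dtype=float)
    return mid + side, mid - side

def ms_encode(left, right):
    """Inverse of ms_decode under the same scaling assumption."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    return (left + right) / 2.0, (left - right) / 2.0
```

Encoding a left/right pair and decoding the result returns the original pair, which is a quick way to check that the two functions agree on scaling.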


[0026] The user also may enter microphone polar patterns and roll-off characteristics for each of
the microphones employed in any given simulation. For that purpose, buttons 125, 126,
127, 128, and 129 are provided for selecting a microphone roll-off characteristic or response.
For example, buttons 125 and 126 select two different low-frequency bumps; button 127 selects
a flat response, and buttons 128 and 129 select two different low-frequency roll-off responses,
respectively. Similarly, buttons 130-134 allow a user to select one of several different
well-recognized microphone polar patterns, such as an omni-directional pattern (button 130), a
wide-angle cardioid pattern (button 131), a cardioid pattern (button 132), a hyper cardioid pattern
(button 133), or a so-called "figure-8" pattern (button 134).

[0027] The control panel 100 also includes a placement control section 135, which, in the
illustrated embodiment, contains a plurality of placement selector/indicator buttons (designated
by numbers 1 through 18). These placement selector/indicator buttons allow a user to specify a
position of musical instruments within the user-selected musical context (e.g., the position of the
instrument selected by the drop-down menu 106 relative to the user-specified microphone(s)).
The graphical display area 118 may display a depiction of the perspective of the room or musical
context selected by the drop-down menu 102 corresponding to the placement within that room or
musical context specified by the particular placement selector/indicator button actuated by the
user. Of course, as will be readily apparent to those of ordinary skill in the art, many different
alternative means may be employed to permit a user to select instrument placements within a
particular musical context in addition to or instead of the placement selector/indicator buttons
shown in FIG. 1A. For example, a graphical depiction of the room or musical context could be
displayed, and a mouse, trackball, or other conventional pointer control device could be used to
move a location designator to a predetermined placement within the graphical depiction of the
room or musical context corresponding to whatever placement within that room or musical
context may be desired by the user.

[0028] As also shown in FIG. 1A, the control panel 100 also includes a "mic-to-output" control
section 136, which includes an array of buttons allowing a user to assign each microphone used
in a given simulation to a corresponding mixer output channel. As shown, the control panel 100
provides for seven mixer output channels represented by the columns of buttons numbered one
through seven in the mic-to-output control section 136. Seven mixer output channels allow for
seven microphones to be used in a given simulation (e.g., left and right front, left and right wide,
left and right surround, and a center channel). Of course, those of ordinary skill in the art will
readily appreciate that more or fewer mixer output channels may be provided in any given
embodiment of the present invention based upon the needs of a particular simulator. For
example, in a stereo simulator, only two mixer output channels need be provided. In order to
assign a particular microphone to a particular mixer output channel, the user need only depress
the button in the row of buttons corresponding to the particular microphone and the column of
buttons corresponding to the particular mixer output channel. The controls in each row of the
mic-to-output control section 136 operate in a mutually exclusive fashion, such that a particular
microphone can be associated only with one mixer output channel at a time.
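The mutually exclusive row behavior of the mic-to-output section can be modeled with a small data structure. This is a hypothetical sketch: the class name, method names, and microphone labels are invented for illustration, and only the seven-column layout and the one-output-per-microphone rule come from the text.

```python
class MicToOutputRouting:
    """Button matrix in which each microphone row selects at most one output column."""

    def __init__(self, mic_names, num_outputs=7):
        self.num_outputs = num_outputs
        # None means the microphone is not yet assigned to any output channel.
        self.assignment = {name: None for name in mic_names}

    def press(self, mic, output):
        """Assign a microphone to an output, replacing any earlier choice in its row."""
        if mic not in self.assignment:
            raise KeyError(f"unknown microphone: {mic}")
        if not 1 <= output <= self.num_outputs:
            raise ValueError(f"output must be in 1..{self.num_outputs}")
        self.assignment[mic] = output

    def row(self, mic):
        """Return the lit/unlit state of every button in the microphone's row."""
        chosen = self.assignment[mic]
        return [chosen == column for column in range(1, self.num_outputs + 1)]
```

Storing a single output number per microphone, rather than a full grid of booleans, makes the mutual exclusion hold by construction.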

[0029] The mic-to-output control section 136 also includes a button 140 for selectively enabling
and disabling a "simulated stereo" mode in which a single microphone simulation or output is
processed to develop two (i.e., stereo) mixer output channels. This may be used, for example, to
enable a simulated stereo output to be produced by a slow computer which does not have
sufficient processing power to handle full stereo real-time processing. A button 142 is provided
for selectively enabling a "true stereo" mode, which simply couples left and right stereo
microphone simulations or outputs to two mixer output channels. Further, a button 144 is
provided for selectively enabling and disabling a "seven-channel" mode in which each of seven
microphone simulations or outputs is coupled to a respective mixer output channel to provide for
full seven-channel surround sound output.
[0030] A button 146 is provided for selectively enabling and disabling a "tail extend" feature
which causes the illustrated synthesizer to derive the first N seconds of the synthesized response
by performing a full convolution and then to derive an approximation of the tail or terminal
portion of the synthesized response using a recursive algorithm (described in more detail below)
which is lossy but computationally efficient. Where exact acoustical simulation is not
required, enabling the tail extend feature provides a trade-off between exact acoustical
simulation and computational overhead. Associated with the tail extend feature are three
parameters, Overlap, Level, and Cutoff; and a respective slider control 148, 150, and 152 is
provided for adjustment of each of these parameters.

[0031] More particularly, the slider control 148 permits adjustment of an amount of overlap
between the recursively generated tail portion of the synthesized response or output signal and a
time-wise prior portion of the output signal which is calculated by convolution at a particular
sample rate. The slider control 150 permits adjustment of the level of the recursively generated
portion of the output signal so that it more closely matches the level of the time-wise prior
convolved portion of the output signal. The slider control 152 permits adjustment of the
frequency-domain cutoff between the recursively generated portion of the output signal and the
time-wise prior convolved portion thereof to thereby smooth the overall spectral damping of the
synthesized response or output signal such that the frequency-domain bandwidth of the
recursively generated portion of the output signal more closely matches the frequency domain
bandwidth of the convolved portion thereof at the transition point between those two portions.

[0032] A plurality of further slider controls may be provided to allow a user to adjust the level
corresponding to each microphone used in a particular simulation. In the illustrated
embodiment, slider controls 154-160 are provided for adjusting recording levels of each of seven
recording channels, each corresponding to one of the available microphones in the illustrated
simulation or synthesizer system. In addition, a master slider control 161 is provided to allow a
user to simultaneously adjust the levels set by each of the slider controls 154-160. As shown, a
digital read-out is provided in tandem with each slider control 154-161 to indicate numerically to
the user the level set at any given time by the corresponding slider control 154-161. In the
illustrated embodiment, the levels are represented by 11-bit numbers ranging from 0 to 2047.
However, it should be evident to those of ordinary skill in the art that any other suitable range of
levels in any suitable units could be used instead.

[0033] The control panel 100 also includes a level button 164, a perspective button 166, and a
pre-delay button 168. The level button 164 allows a user to selectively activate and deactivate
the level controls 154-161. The perspective button 166 allows the user to selectively activate and
deactivate a perspective feature which allows the slider controls 154-161 to be used to adjust a
parameter which simulates, for any given simulation, varying the physical dimensions of the
musical context or room selected by the drop-down menu 102. The pre-delay button 168 allows
the user to employ the slider controls 154-161 to adjust a parameter which simulates echo
response speed (by adjusting the simulated lag between the initial echo in a recorded signal and a
predetermined amount of echo density buildup).


[0034] Alternate exemplary graphical user interfaces (GUI) are illustrated in FIGS. 1B-1D.
These GUIs also permit a user to adjust the various parameters of the system in accordance with
the principles of the present invention. Since these GUIs provide essentially the same functionality
as the control panel illustrated in FIG. 1A, the alternate GUIs are not described further.
PROCESSING ALGORITHM



[0035] FIG. 2 depicts a high-level software block diagram, illustrating a single audio channel for
simplicity, of an exemplary embodiment of an audio processing system 48 in accordance with
the present invention. The audio processing system 48 includes a runtime input channel
processing routine 50, a runtime sequencing, control, and data manager 52, a process-channel
module 53 which includes a multi-rate adaptive filter 54, a collection and alignment routine 56,
and a tail extension processor 58. As shown, input digital audio source samples are digitized by
an analog to digital converter (not shown), for example, a 16 bit or 24 bit, PCM, 44.1, 48, 88.2,
96, 176.4 or 192 kHz sample rate, mono or multi channel ADC, such as the stereo ADC within
the Cirrus Crystal CS4226 codec, and applied to the runtime input channel processing routine 50,
which converts the samples, which are in the time domain, to the frequency domain and applies
the frequency domain samples to the runtime sequencing, control and data manager 52. In
addition, impulse response data representing, for example, the impulse responses corresponding
to various audio characteristics, such as user-selected microphones,
musical context (i.e. acoustic space), musical instruments, and relative positioning of the user
selected microphones and/or musical instruments within the user selected musical context are
stored in a coefficient storage memory device 60. A loadtime coefficient processing routine 62
and a runtime coefficient processing routine 64 are used to successively process coefficients
from the coefficient memory storage device 60 based on the user input 66 provided via, for
example, a control panel or graphical user interface, such as depicted in FIGS. 1A-1D.

[0036] In order to reduce runtime CPU resource utilization, the loadtime coefficient processing
routine 62 pre-processes at load time the time domain impulse coefficients from storage device
60 with audio signal processing to facilitate changes to the audio response based on user input,
and converts the resulting time domain coefficient data into the frequency domain. The runtime
sequencing, control, and data manager 52 processes the audio source input samples and the
processed impulse response coefficients as to facilitate CPU load balancing and efficient real
time processing. The processed samples and coefficients from the runtime sequencing, control,
and data manager 52 are applied to the process channel module 53 in order to produce audio
output samples 68, which emulate the audio response of the input audio source to various user
selected audio characteristics.
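The benefit of converting impulse coefficients to the frequency domain once, at load time, can be sketched with a generic overlap-add fast convolution. This is not the multi-rate adaptive filter of the disclosure; it is a textbook technique shown under stated assumptions, and the function names and parameters are hypothetical.

```python
import numpy as np

def load_time_prepare(impulse, frame_size):
    """Transform the impulse response once so run time only multiplies spectra."""
    fft_size = 1
    # The FFT must hold one input frame plus the impulse tail without wrap-around.
    while fft_size < frame_size + len(impulse) - 1:
        fft_size *= 2
    return np.fft.rfft(impulse, fft_size), fft_size

def run_time_convolve(samples, coeffs_f, fft_size, frame_size, impulse_len):
    """Overlap-add convolution of framed input with precomputed coefficients."""
    samples = np.asarray(samples, dtype=float)
    out = np.zeros(len(samples) + impulse_len - 1)
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        block = np.fft.irfft(np.fft.rfft(frame, fft_size) * coeffs_f, fft_size)
        valid = len(frame) + impulse_len - 1
        # Each block's tail overlaps the next frame's output and is summed in.
        out[start:start + valid] += block[:valid]
    return out
```

The result matches direct time-domain convolution, but the forward transform of the impulse response is paid only once per user selection rather than once per frame.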

[0037] FIG. 3 illustrates a block diagram of one exemplary embodiment of the runtime input
channel processing routine 50 shown in FIG. 2. Referring to FIG. 3, the runtime input channel
processing routine 50 receives digitized audio source samples at a first sample rate, for example
48 kHz, from a digital sample buffer (IOBUF) 70. The digital sample buffer 70 is sized 32 audio
samples of 32 bits each. Digital samples from the digital sample buffer 70 are copied on a
frame-by-frame basis by frame copy routines (B) and (A) 72 and 74, respectively, to respective frame
buffers (XLB) and (XLA) 76 and 78, respectively. More particularly, the same input samples are
framed into two separate buffers, XLB and XLA, of potentially different frame sizes as to
facilitate subsequent processing at two different sample rates. The frame size of the XLB buffer
is smaller relative to XLA, typically one eighth in size relative to XLA. The tail maintenance
routine 80 copies a finite impulse response (FIR) filter length of data from the beginning to the
end of the frame buffer XLA, as to cover the FIR coefficient overlap required by the 2:1
decimation filter 90. The decimation filter 90 downsamples the entire XLA frame size of audio
source samples, said frame size corresponding to the lower sample rate, for example one half the
audio source sample rate, and copies these samples to the decimation frame buffer 92.

[0038] A fast Fourier transform (FFT) module 82, including FFT routines 84, 86, and 88, is
provided for converting frames of data, which are represented in the time domain in frame
buffers 76 and 78, into corresponding frequency-domain data. More particularly, the FFT routine
84 produces a fast Fourier transform of an XLB frame from the frame buffer 76 and provides the
transformed data to a frequency domain buffer (XLBF) 94. In a turbo mode, frame data from the
frame buffer (XLA) 78 is filtered by a low-pass filter, for example a 2:1 filter, to reduce the
sample rate to one half of the audio input source sample rate. The low pass filter simply reduces the
audio bandwidth to one half of the input sample bandwidth and truncates the result by saving
only every other sample. The filtered samples are stored in a decimation frame buffer 92.
This decimation frame buffer 92 contains the band reduced and truncated samples produced by
low pass filtering and throwing away every other sample, and passes these samples to the FFT
routine 86 which performs an FFT on the decimated, filtered frame data and stores the resulting
frequency domain frame data in a frequency domain buffer (XLAF) 96.

[0039] In the event a user wishes not to employ tail end processing (i.e., preferring instead to
achieve the acoustic accuracy of full-sample-rate convolution, which requires greater processing
power), the FFT routine 88 may be operated at the full sample rate (i.e., the same sample rate as the
input samples) to transform the frame data from the frame buffer (XLA) 78 at its original sample
rate and thus provide full-sample-rate frequency domain data to the frequency domain buffer 96
(XLAF).
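The turbo-mode path described above (low-pass filter to half the bandwidth, keep every other sample, then transform) can be sketched as follows. The filter design is an assumption: a 63-tap Hamming-windowed sinc is used purely for illustration, the frame edges are zero-padded rather than covered by a tail maintenance step, and all function names are hypothetical.

```python
import numpy as np

def halfband_lowpass(num_taps=63):
    """Windowed-sinc FIR with cutoff at one quarter of the input sample rate."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 0.5 * np.sinc(0.5 * n) * np.hamming(num_taps)
    return taps / taps.sum()  # normalize for unity gain at DC

def decimate_by_two(frame, taps):
    """Band-limit the frame, then discard every other sample."""
    filtered = np.convolve(frame, taps, mode="same")
    return filtered[::2]

def turbo_frame_to_frequency_domain(frame, taps):
    """Decimate a time-domain frame 2:1 and return its spectrum."""
    return np.fft.rfft(decimate_by_two(frame, taps))
```

With a 2048-sample frame, the decimated frame holds 1024 samples, so the subsequent FFT and every later frequency-domain multiply operate on half as much data per frame.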

[0040] Operation of the frame copy routines (B) and (A) 72 and 74, the tail maintenance routine
80, the FFT module 82, and the low-pass filter 90 is handled by a frame control process routine
98. The frame control process routine synchronizes the timing of the frames so that they work in
phase together, assembling a frequency domain frame which is larger than the time domain
frame size, such that an entire frequency domain frame is made up of multiple time domain
frames. The frame control process also synchronizes the multiple sample rates and frame sizes
of the XLA, XLB, XLAF, and XLBF buffers, as fed into the real time scheduling and CPU load
balancing routines within the runtime sequencing, control and data manager 52.
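The dual framing of the same input into a small-frame buffer and a large-frame buffer can be sketched as below. The 8:1 ratio follows the "one eighth" figure given for the XLB and XLA frame sizes; the 256-sample small frame, the function name, and the use of array reshaping are assumptions made for illustration.

```python
import numpy as np

def frame_input(samples, small_frame, ratio=8):
    """Frame the same samples at two frame sizes, keeping both views in phase."""
    samples = np.asarray(samples, dtype=float)
    large_frame = small_frame * ratio
    # Use only whole large frames so both framings cover the same samples.
    usable = (len(samples) // large_frame) * large_frame
    xlb = samples[:usable].reshape(-1, small_frame)   # small frames
    xla = samples[:usable].reshape(-1, large_frame)   # large frames
    return xlb, xla
```

For example, 4096 input samples with a 256-sample small frame yield sixteen small frames and two large frames, and each large frame is exactly eight consecutive small frames laid end to end.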

FIG 4 depicts a block diagram illustrating in greater detail the audio processing system shown in 
FIG 2, including an expanded illustration of the flow of data that occurs in operation of that 
system. As shown, a plurality of audio source input channels are shown, CH. 1, CH, 2..,.CH- N- 
■~s As discussed above, the audio source input channels CH- 1, CH. 2. . ..CH. N. are each processed 

by the runtime input channel processing routine SO (FIG. 3) which is used to convert the time 
domain audio source samples, segregated into multiple sample rates, to their respective 
frequency domain buffers for further processing. As discussed above, the frequency domain 
samples for each channel are stored in a plurality of frame buffers XJ3fl, XLAf2,.; XLAfN, 
identified with the reference numerals X02, 103 and 104, respectively, one frame buffer for each 
phatmel. Each of the frame buffers 102, 103, 104 is sized to receive one frame of input audio 
samples at a time from a corresponding one of the N audio input channels, for example 2048 
32bit samples. The run time memory 100 also includes a plurality of data structures 106, 107, 
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108 which represent the coefficients, for example, of M impulse responses for M acoustic 
characteristics (i.e. acoustical space models or other acoustic characteristics), and their respective 
control parameters, indices, and buffers. The impulse response data is retrieved from the co- 
efficient memory storage device 60 by a load and process routine 110 in response to a user 
command monitored by a load and process routine 110 via an VQ control routine 111. Routine 
110 is comprised of routines 62 and 64 (FIG 2). Jh particular, the VO control routine simply 
monitors user inputs to the GUIs illustrated in FIGS. I A or IB and retrieves the data structures of 
the co-efficients that correspond to the user selected acoustic characteristic. The load and 
process routine 110 simply loads the selected data structures into the run-time memory 100 on a 
channel by channel basis. These data structures are identified in the runtime memory as 
IMPULSE 1, IMPULSE 2.. . IMPULSE M, 106, 107 and 108, respectively. As shown in FIG 4, 
the frequency domain data PXLBfl, PXLAfl ; PXLBf2, PXLAf2;. . .PXLBfN, PXLAfN from the 
frame buffers 102, 103, 104 and the data structures plcl, pic2...plcM, 1Q6, 107, 108, 
respectively is communicated to a channel sequencing module 118 which serves to time- 
rouluplex the data for processing by the process 53. In particular, in&rmation passed from the 
channel sequencing module 118 to the process channel module 120 includes, fax each of the N 



audio input channels, data representing a time synchronized first framed portion of each frame of 
data received via that audio input channel (PXLBitf), i = 1, 2....N), data representing a time- 
synchronized second framed portion of the same data received via that audio input channel 
(PXLAf(i), i - 1,2, ...N). Other variables are also passed to the process channel module 53: 
pIc(i) is a pointer to the tagDynamicChannelData data of impulse channel (i); pIOBuf(i) is a
pointer to the output buffer of impulse channel (i); dwFRAMESize is the number of time domain
samples input into and output from the process channel routine 53 each time it is called by the
host; pI is the pointer to the instance data structure, which is unique to the instance but shared
among the plurality of channels for each instance; simulated stereo is a control bit which
enables/disables the simulated stereo function; M/S decode is a control bit which
enables/disables the Mid-Side audio decoder function; and control is a real time scheduling
control bit which enables left and right channels to be processed on separate frames to facilitate
real time CPU load balancing. All of this data passes from the channel sequencing
module 118 to the process channel routine 120. As also shown in FIG. 4, bi-directional
communication is provided between the process channel routine 53 and the run time memory
100, as indicated by arrows 122.
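
The Mid-Side decoder function enabled by the control bit mentioned above can be illustrated with the standard Mid-Side relation, L = M + S and R = M - S. The following is a minimal sketch only; the function name ms_decode_frame, the per-frame calling convention, and the absence of any gain normalization are assumptions and are not taken from the Computer Listing Appendix.

```c
#include <assert.h>
#include <stddef.h>

/* Decode one frame of Mid-Side audio into left/right samples.
 * Uses the standard relation L = M + S, R = M - S.
 * Hypothetical helper: name and signature are illustrative only. */
void ms_decode_frame(const float *mid, const float *side,
                     float *left, float *right, size_t n)
{
    assert(mid && side && left && right);
    for (size_t i = 0; i < n; ++i) {
        float m = mid[i];   /* read both inputs first so that      */
        float s = side[i];  /* in-place decoding is also possible  */
        left[i]  = m + s;
        right[i] = m - s;
    }
}
```

A host would call such a routine once per frame, with n equal to dwFRAMESize, only when the Mid-Side control bit is set.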

[0041] A plurality of T output buffers OUT 1, OUT 2 ... OUT T, identified with the reference
numerals 112, 113, 114, are provided in the run-time memory 100. Each of the output buffers
112, 113 and 114 is sized to receive one frame of output audio samples at a time for outputting
the respective T output sample streams. The output buffer pointers pIOBuf1,
pIOBuf2 ... pIOBufT for the user selected audio characteristic of each channel CH. 1, CH.
2 ... CH. N of the input audio samples are time multiplexed by the channel sequencing module 118
to provide independent references to process channel 53, which synthesizes audio output streams
in real time into the output buffers OUT 1, OUT 2 ... OUT T, identified with the reference
numerals 112, 113 and 114.



[0042] Multiple copies or multiple instances of the same audio processing system 48 can be used
simultaneously or in time multiplex. The multiple instances allow for simultaneous processing,
for example, of different musical instruments. For example, the location of each
instrument in an orchestra relative to a microphone can be simulated. Since such instruments are
played simultaneously, multiple copies or instances of the audio processing system 48 are
required in order to synthesize the effects in real time. As such, the channel sequencing module
118 must provide appropriate references for all of the copies or instances to the process channel
module 53. Accordingly, an instance data buffer I, identified with the reference numeral 116, is
provided in the runtime memory 100 for each instance of the audio processing system 48 being
employed.

[0043] In order to provide a clear understanding of the audio processing involved in the present
invention, a time-domain representation of an exemplary impulse response input signal is shown
graphically in FIG. 6. As shown, the impulse input signal includes a time wise first portion
(designated "b"), a continuous, time wise second portion (designated "a"), and a "tail" portion
that extends continuously beyond the time wise second portion "a". In the time domain, the
impulse input signal may be partitioned into groups of samples. The first portion of the impulse
input signal (hereinafter referred to as the "b" portion) preferably includes a number of samples
corresponding to the major frame size for FFT blocks, XLENA2, and the time wise second
portion (hereinafter referred to as the "a" portion) preferably is made up of a number of such
frames of samples. There is also a minor frame size for FFT blocks, XLENB2, which is, for example,
one eighth of the major block size in the exemplary embodiment. The total number of samples
making up the audio signal illustrated in FIG. 6 is denoted by FTAPS2. A pointer hindex is used
to designate a relative position within the aggregate collection of samples making up the
illustrated audio impulse response or input signal.
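
The partitioning described above can be sketched as a small frame-count calculation. This is an illustrative sketch under stated assumptions: the "b" portion is taken to span exactly one major frame, the minor frame is one eighth of the major frame as in the exemplary embodiment, and the major frame size and total tap count are treated as plain sample counts. The type PartitionLayout and the function partition_impulse are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of how an impulse response is split into frames. */
typedef struct {
    size_t minor_frame;     /* minor frame size (one eighth of major)  */
    size_t b_minor_frames;  /* minor frames covering the "b" portion   */
    size_t a_major_frames;  /* major frames covering the "a" portion   */
} PartitionLayout;

/* total_taps plays the role of the total impulse length and
 * major_frame the role of the major frame size (assumed semantics). */
PartitionLayout partition_impulse(size_t total_taps, size_t major_frame)
{
    PartitionLayout p;
    assert(major_frame >= 8 && major_frame % 8 == 0);

    p.minor_frame = major_frame / 8;

    /* "b" portion: the first major frame worth of samples (or less). */
    size_t b_len = total_taps < major_frame ? total_taps : major_frame;
    p.b_minor_frames = (b_len + p.minor_frame - 1) / p.minor_frame;

    /* "a" portion: everything after "b", in whole major frames. */
    size_t a_len = total_taps - b_len;
    p.a_major_frames = (a_len + major_frame - 1) / major_frame;
    return p;
}
```

For a 48000 sample impulse response and a 4096 sample major frame, this yields a 512 sample minor frame, 8 minor frames for the "b" portion, and 11 major frames for the "a" portion.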

[0044] There is a unique coefficient index for the "a" portion and the "b" portion, HindexA and
HindexB, respectively. FIG. 8 illustrates the coefficient index sequencing routine, illustrated in
FIG. 5 by the block 170, and shows the coefficient index sequencing derived from XLENA2, XLENB2,
HindexA, HindexB, and HLENAA, the latter of which is scaled to half the size of XLENA2
when operating in turbo mode and is otherwise equal to XLENA2. The HindexA and HindexB
indices are derived within block 53 and switched by a control signal LPbaseAB to adapt the
coefficients within the adaptive filter to accommodate the "a" and "b" portions of the
impulse response.

[0045] FIG. 5 depicts a block diagram illustrating in greater detail the operation of the process
channel routine 53 (FIGS. 2 and 4) as described above, and in particular the pseudo-convolution
processing routine in accordance with the present invention for use on general purpose CPUs,
such as an Intel Pentium 4 processor, at a greatly reduced processor load. Conventional frequency
domain convolution is simply a vector multiply of frequency domain multiplicands, followed by
an inverse Fourier or Fast Fourier transform of the products at a single, uniform, non-time
varying, fixed sample rate and block size, resulting in significantly higher computation and
throughput demands. Conventional convolution does not contain the processes necessary for framing,
synchronizing or processing the multirate input audio signal or the multirate impulse responses, nor
does it employ an adaptive filter with time varying coefficients for a given impulse response.
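
The "vector multiply of frequency domain multiplicands" at the core of conventional frequency domain convolution can be sketched as an element-wise complex multiplication of two spectra. The sketch below is generic and does not reproduce the routine of the present invention; the type Cpx and the function spectrum_multiply are hypothetical, and the forward and inverse transforms that surround this step are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complex bin type (real and imaginary parts). */
typedef struct {
    float re;
    float im;
} Cpx;

/* Element-wise complex product of an input spectrum x and a filter
 * spectrum h. The output may alias either input, because both
 * operands are copied before the result is written. */
void spectrum_multiply(const Cpx *x, const Cpx *h, Cpx *out, size_t bins)
{
    assert(x && h && out);
    for (size_t i = 0; i < bins; ++i) {
        Cpx a = x[i];
        Cpx b = h[i];
        out[i].re = a.re * b.re - a.im * b.im;
        out[i].im = a.re * b.im + a.im * b.re;
    }
}
```

An inverse transform of the resulting products would then return one block of convolved audio to the time domain.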

[0046] As shown in FIG. 5, dynamic channel data 150, identified in FIG. 4 as "CONTROL"
from the channel sequencing module 118, is applied to the process channel routine 53. In
particular, for each copy or instance of the audio processing system 48, the channel sequencing
routine formulates a dynamic data structure 150 for each channel based upon the user selected
audio characteristic and the incoming audio source samples. More particularly, as mentioned
above, input audio samples are converted to the frequency domain and stored in the runtime
memory 100. The impulse response coefficients for the various user selectable acoustic
characteristics are likewise stored in the runtime memory 100. All of this data is formulated into
a data structure, for example, the exemplary data structure 150 illustrated in FIG. 5. One data
structure 150 is provided for each channel of convolution currently being processed in real time
and assigned a separate input channel.



158, 160, 162, 164, 166, and 168, as shown. As shown in FIG. 5, the frequency domain
coefficients Hx(f) of a finite impulse response (FIR) filter are used to form the field 154. The field
152, accessed via specific reference within the structure pointed to by the pIc(n) pointer
identified in FIG. 4, may be used to represent the following: (1) an index
reference (hindex B) indicating that a time wise first portion of the impulse response input data is
being processed; (2) an index reference (hindex A) indicating that a time wise second portion
of the impulse response input data is being processed; and (3) additional control data, such as
runtime MicLevel, Perspective, DirectLevel, tail extension audio processing control parameters,
simulated stereo control runtime parameters, and other audio digital signal processing

the tagDynamicChannelData data structure table below.

[0048] The field 154 (FIG. 5) contains the frequency domain filter coefficients Hx(f), which may
also be in the form of acoustic impulse responses, populated with a frequency domain
representation of an acoustic model being simulated in the form of a FIR (e.g., a particular
acoustic space, a particular microphone, a particular musical instrument body resonance
characteristic, etc.). This FIR is stored in the data structure Hx(f), sized to accommodate
twice the number of time domain samples making up the acoustic model (i.e., IMPSIZE*2), in
order to accommodate the frequency domain representation.

[0049] The field 156 is dynamically generated and contains an intermediate part of the product
of a vector multiplication, from a vector multiplier 172, of the FIR coefficients pointed to by a
Coefficient Index Sequencing Routine 170 and the frequency domain audio source input data
XLBF, XLAF, illustrated in the box identified with the reference numeral 174, for the N
channels. The buffer XLBF contains the full sample rate, early portion of the impulse response
or FIR filter coefficients in the frequency domain output from (FIG. 3) into field 94, and when
turbo mode is enabled, the buffer XLAF contains the half sample rate, later portion of the impulse
response or FIR filter coefficients in the frequency domain output from (FIG. 3) into field 96.

The cf intermediate product is converted to the time domain by the Inverse Fast Fourier
Transform routine IFFT 176 and stored in the field 158. The time domain data in field 158 (Hlen,
halfHlen) is applied to an Audio Collection and Index Sequencing Routine 178, which, along with
the collection indices data, acolindexA&B and acolindexPrevA&B, in field 160, is used to develop
the data in fields 162, 164 and 166, as discussed below.

[0050] Hlen represents in the time domain the equivalent of one frame of frequency domain
data; halfHlen represents in the time domain the equivalent of one-half frame of frequency
domain data.

[0051] The field 160 contains indices to past and present frames in the audio collection buffer
for the B portion of the impulse response (acolindexprevB and acolindexB, respectively) and for
the A portion of the impulse response (acolindexprevA and acolindexA, respectively). The field
162 contains the audio collection buffer (acol) corresponding to the processing which
occurs at the full sample rate, as indicated by the block 178a (FIG. 7). It is an intermediate
accumulator that facilitates the overlap-add or overlap-subtract and which comprehends
frame-based overlap and modulo addressing. This buffer (acol) 162 is modulo addressed, as indicated
by the block 192 (FIG. 7), and is sized to the impulse size (the time-domain length of the
impulse response); successive frames are overlap added or overlap subtracted into it.
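
The modulo addressed overlap-add (or overlap-subtract) into a collection buffer such as acol can be sketched as follows. This is a generic illustration, not the routine 178 itself: the function names acol_overlap_add and acol_advance, the explicit sign argument, and the per-sample wrap test are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Accumulate one time-domain frame into a circular collection buffer,
 * starting at acol_index and wrapping at acol_len.
 * sign = +1.0f gives overlap-add, sign = -1.0f gives overlap-subtract. */
void acol_overlap_add(float *acol, size_t acol_len, size_t acol_index,
                      const float *frame, size_t frame_len, float sign)
{
    assert(acol && frame && acol_len > 0 && acol_index < acol_len);
    size_t pos = acol_index;
    for (size_t i = 0; i < frame_len; ++i) {
        acol[pos] += sign * frame[i];
        if (++pos == acol_len) {
            pos = 0;  /* modulo addressing */
        }
    }
}

/* Advance a collection index by one hop, modulo the buffer length. */
size_t acol_advance(size_t acol_index, size_t hop, size_t acol_len)
{
    assert(acol_len > 0);
    return (acol_index + hop) % acol_len;
}
```

Keeping the previous and current indices separately, as field 160 does, lets the output stage read the samples that are complete while later frames are still being accumulated.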

[0052] FIG. 9 and FIG. 10 show more detail on the maintenance of the audio collect and hindex
indices within the audio collection and index sequencing 178. Prior to a call to the vector
multiply 172, Hlen is assigned to either XlenA or XlenB and acolindex is assigned to
acolindexA or acolindexB, and the respective impulse coefficient and collection buffer indices
are modulo updated. This is done so as to adapt the frequency domain filter coefficients on the fly
to block processing of multiple portions of an impulse response within the same filter module.

[0053] As shown in FIG. 10, the coefficient index hindex, illustrated in FIG. 6, is set depending
on which part of the waveform shown in FIG. 6 is being processed. As shown in FIG. 10, if the
early part of the waveform is being processed, identified in FIG. 6 as "b", as determined by the
decision block 203, the coefficient index hindex is set to 0. If the later portion of the waveform is
being processed, identified in FIG. 6 as "a", as determined by the decision block 205, the
coefficient index is set to XlenA2, which is the beginning of portion "a".



[0054] As shown in FIG. 7, when the vector multiply and IFFT stages are operating at half of
the full sample rate, as when in turbo mode and when processing the samples from portion A
as determined by the decision block 200, the audio collection and index sequencing field, as
generated by the audio collection and index sequencing routine 178 (FIG. 5), and the respective
collection indices will phase align and overlap-add respective audio frames from the ct field 158
into the buffer acolH, the audio collect half sample rate field 164. Also as shown in FIG. 7, when tail
extension is selected and set and the coefficient index, hindex, is greater than the tail collection
limits determined by the decision block 178b, the audio collection and index sequencing
routine operates as illustrated by the block 178c, and the audio collection and index sequencing
field 178 (FIG. 5) and the respective collection indices will phase align and overlap-add or
overlap-subtract respective audio frames from the ct field 158 into the buffer acolDH, the audio
collect delay half rate field 166. The acolH and acolDH buffers, 164 and 166, respectively, are
modulo addressed as indicated by the blocks 192 (FIG. 7) and are sized to have a length that is
half the impulse size plus the number of taps in the 1:2 upsample filter field 180, the tap length
being added to the buffer size in order to facilitate the buffer tail overlap typical of FIR type filters.

[0055] After all half sample rate processing is offset according to appropriate phase by collection
indices and overlap added into acolH, the 1:2 upsample block field 180 converts the half sample
rate data into the full sample rate and accumulates the result into the audio collect full sample
rate buffer field 162.
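
The 1:2 upsample and accumulate step can be sketched with a deliberately short interpolation filter. The sketch below assumes a three-tap linear interpolator (taps 0.5, 1.0, 0.5 applied to the zero-stuffed signal); the actual filter length in the disclosed system is not stated in this section, and the function name upsample2_accumulate is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Convert n_half samples at the half rate into 2*n_half samples at the
 * full rate and ADD them into the full-rate buffer (accumulate, not
 * overwrite). Even outputs copy the input sample; odd outputs are the
 * average of neighbouring input samples (linear interpolation). */
void upsample2_accumulate(const float *half, size_t n_half, float *full)
{
    assert(half && full);
    for (size_t i = 0; i < n_half; ++i) {
        float cur  = half[i];
        float next = (i + 1 < n_half) ? half[i + 1] : 0.0f;
        full[2 * i]     += cur;
        full[2 * i + 1] += 0.5f * (cur + next);
    }
}
```

A longer low-pass filter would give better image rejection at the cost of extra buffer length, which is consistent with the half rate buffers being sized with additional room for the filter taps.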

[0056] After all half sample rate tail extension processing is offset according to appropriate
phase by collection indices and overlap added into acolDH, the tail extension 1:2 upsample block
field 182 converts this tail extension half sample rate data into the full sample rate and
accumulates the result into the tail extension audio collect delay full rate buffer, acolD, field 168.

[0057] Tail extension processing is optionally enabled by the user in order to model the very end
portion of an impulse response, to mitigate the fact that convolution processing is very
CPU intensive. More particularly, rather than spend valuable computation time on portions of an
impulse response that may be nearing the point of inaudibility or are otherwise less significant than
earlier portions of an impulse response, tail extension modeling employs an algorithmic model at
a far lower computational load. For example, if an impulse response is 4 seconds in duration, the
last second may be modeled to save premium convolution processing time for only the early part
of the response.

[0058] FIG. 11 is an exemplary tail extension model. The model illustrated in FIG. 11 is
exemplary and includes two basic routines: a copy scale routine asmcpyscale 207 and a filter
routine asmfbkfilt 209, as shown. Other configurations are also within the scope of the
invention. The audio data is written into a read/write buffer acolD 168. As shown in FIG. 5, the tail
extension processing routine processes this data in the buffer acolD and returns it to the buffer
acol.

[0059] The later portion of the convolution processing, for example the third second in our 4
second impulse example, may be copied into a buffer, acolDH, at the half sample rate, or acolD
at the full sample rate. The tail extension model, similar to a conventional reverberation
algorithm, is synchronized and applied to the late response. There are low pass filters for timbre
matching, volume control for volume matching to the tail level of the actual impulse, and feedback
and overlap parameters, all of which facilitate a smooth transition from convolution processing
to algorithmic processing.
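
Because the tail extension model is described as similar to a conventional reverberation algorithm with low pass filtering and feedback, its general flavor can be illustrated with a textbook feedback comb filter that has a one-pole low pass filter in the loop. This is a generic sketch and not the model of FIG. 11; the type TailComb, the function tail_comb_tick, and the parameters feedback and damp are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for a damped feedback comb filter. */
typedef struct {
    float *delay;     /* circular delay line                        */
    size_t len;       /* delay length in samples                    */
    size_t pos;       /* current read/write position                */
    float  feedback;  /* loop gain, |feedback| < 1 for decay        */
    float  damp;      /* 0 = no filtering, toward 1 = darker tail   */
    float  lp_state;  /* one-pole low pass filter memory            */
} TailComb;

/* Process one sample: read the delayed sample, low pass it, and feed
 * it back into the delay line together with the new input. */
float tail_comb_tick(TailComb *c, float in)
{
    assert(c && c->delay && c->len > 0 && c->pos < c->len);
    float out = c->delay[c->pos];
    /* lp_state = (1 - damp) * out + damp * lp_state */
    c->lp_state = out + c->damp * (c->lp_state - out);
    c->delay[c->pos] = in + c->feedback * c->lp_state;
    if (++c->pos == c->len) {
        c->pos = 0;
    }
    return out;
}
```

In this kind of structure the damping coefficient plays the timbre matching role and the feedback gain sets the decay rate, so both would be tuned to match the level and color of the measured impulse response at the hand-over point.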

[0060] An important aspect of the invention relates to embedding and controlling convolution
technology within a sampler or synthesizer, that is, a music sampler or a music synthesizer
engine, so that the convolution technology adds to the description of a virtual musical instrument.
One example relates to modeling an acoustic piano. In that example, the behavior of the piano
soundboard resonance is emulated. The parameters that control the impulse
response of the piano soundboard may be saved into a file description which contains both the
original samples of the individual notes on the piano and control parameters to dynamically scale
the convolution parameters in real time, such that the behavior of the modeled version is
the same as that of an acoustic piano soundboard. In essence, the system embeds and controls
convolution-related parameters within a synthesizer engine, thus embedding the convolution process
inside the virtual musical instrument processing itself. Typically, a sampler or synthesizer engine
includes an interpolator which provides pitch control, a low-frequency oscillator (LFO), and an
envelope generator, which provides dynamic control of amplitude over time.
These all process audio which is routed through a convolution process, so that other
aspects of the control and modeling of the sound come from the synthesizer engine
dynamically controlling the convolution process. Examples of dynamically controlling the
convolution process are: controlling the pre- and post-convolution levels; damping audio
energy within the convolution buffers to simulate the damping of a piano soundboard, as
when the damper pedal is raised; changing the wet/dry mix; adding and subtracting various impulse
responses representing various attributes of a sound; and changing the "perspective control."
Perspective control changes the envelope of the impulse
response in real time as a musical instrument is being played. By combining all of these
processes, physical instruments can be modeled with far greater detail and accuracy than before.

[0061] Various file structures can be employed in which the impulse responses associated with
the sound of a musical instrument, the control parameters associated with the impulse responses,
the digital sound samples representing single or multiple notes of an instrument, and the control
parameters for the synthesizer engine filters, LFO, envelope generators, interpolators, and sound
generators are stored together in a file structure representation of a musical instrument. This
file structure has single or multiple data fields representing each of these characteristics of the
synthesized sound, which may be organized in a variety of ways using a variety of file data
types. This musical instrument file structure may include the ambient environment, instrument
body resonance, microphone type, microphone placement, or other audio character of the
synthesized sound. An example file structure is as follows: Impulse Response 1 ... Impulse
Response (n), Impulse Response 1 impulse control 1 ... Impulse Response 1 impulse control (m),
Impulse Response (n) impulse control 1 ... Impulse Response (n) impulse control (m), digital
sound sample 1 ... digital sound sample (n), sampler engine control parameter 1 ... sampler engine
control parameter (n), synthesizer engine control parameter 1 ... synthesizer engine control
parameter (r), pointer to other file 1 ... pointer to other file (n). Together, these parameters
represent the sound behavior of a musical instrument or sound texture generator, in which the
impulse responses and their interactivity within the synthesizer engine via user performance data
contribute to the sound produced by the instrument model.

An exemplary channel data structure is illustrated below. The Channel Sequencing Routine 118 (FIG.
4) chooses the particular pointers and controls that are fed into process channel. Each instance of
this data structure represents one dynamic channel data impulse block in Runtime Memory 100.
Piecewise convolution is done (in portions which are combined) by an "overlap-add" method
(technically, overlap-subtract with a downstream phase reversal).



typedef struct _tagDynamicChannelData
{
Data Type / Variable/Field / Description
(Ipp32f: 32-bit floating point; DWORD: unsigned integer)

Ipp32f  Hx[IMPSIZE*2];
        //filter impulse response (FIR filter), frequency domain representation of the
        //acoustic model being simulated (e.g., acoustic space, microphone, musical
        //instrument body resonance, etc.)
Ipp32f  acol[ACOLLENGTH];
        //audio collect (intermediate accumulator for overlap-add); comprehends
        //frame-based overlap and modulo addressing; modulo addressed buffer sized
        //to the impulse size (the time-domain length of the impulse response) into
        //which successive frames are overlap added
Ipp32f  acolH[(ACOLLENGTH/2) + LPFTAPS];
        //turbo collect Half size for reduced sample rate to save CPU overhead
Ipp32f  acolDH[(ACOLLENGTH/2) + LPFTAPS];
        //turbo tail ext Delay collect Half size for reduced sample rate to save CPU
        //overhead
Ipp32f  acolD[ACOLLENGTH];
        //tail extension delay
Ipp32f  tarMicLevel;
        //target mic level scale (new setpoint)
Ipp32f  delMicLevel;
        //delta mic level scale (recursively calculated transition level, graduated
        //from one sample to the next to smooth the transition)
Ipp32f  runMicLevel;
        //runtime mic level scale (original [illegible] applied)
Ipp32f  runMicPerspec;
        //runtime mic perspective scale (affects the envelope of the impulse response;
        //used for runtime scaling); applies a volume scaling to the early part of the
        //response that increases or decreases volume to create the perception of a
        //close or distant perspective on the audio; correlated to some scale value
        //that makes sense for the user interface of the apparent distance of the
        //sound source from the mic
Ipp32f  tarDirectLevel;
        //target Direct Sim stereo level scale
Ipp32f  delDirectLevel;
        //delta Direct Sim stereo level scale
Ipp32f  runDirectLevel;
        //runtime Direct Sim stereo level scale
[illegible]
        //alignment dummy [partly illegible]
DWORD   hindexA;
        //sub frame for Hx[hindex]; index reference into impulse response (the current
        //portion on which a convolution is being performed (A portion, which may be
        //at a lower sample rate))
DWORD   acolindexA;
        //audio collect buffer index; current index into collect buffers used for
        //overlap-add
DWORD   acolindexprevA;
        //previous frame collection buffer index; previous index value in the
        //collection or accumulation
DWORD   outindex;
        //collect buffer output index; index to where data can be read from the audio
        //collect buffer [second from top above] and sent to the output buffer
DWORD   hindexB;
        //sub frame for Hx[hindex]
DWORD   acolindexB;
        //audio collect buffer index
DWORD   acolindexprevB;
        //previous frame collection buffer index
DWORD   dummyxxxx;
        //alignment dummy (placeholder)
DWORD   dlyWindex;
        //tail extension delay Write index (delay needed to align the modeled portion
        //of the impulse response with the actual result of the impulse response)
DWORD   dlyRindex;
        //tail extension delay Read index (delay needed to align the modeled portion
        //of the impulse response with the actual result of the impulse response)
DWORD   te1A;
        //tail extension state variable (for Tail Extension Filter)
DWORD   te2A;
        //tail extension state variable
DWORD   FcFbk;
        //tail extension state variable
[illegible]
        //tail extension state variable
[illegible]
        //tail extension state variable
[illegible]
        //tail extension state variable
[illegible]
        //tail extension state variable
[illegible]
        //tail extension state variable
DWORD   [illegible];
        //[illegible] byte alignment dummy
DWORD   [illegible];
        //16 byte alignment dummy
[illegible]
        //tail extension, all-pass buffer
[illegible]
        //tail extension, all-pass buffer
Ipp32f  [illegible];
        //sim stereo delay buffer (for slow processors; simulates stereo audio by
        //using stereo [illegible] filters and complementary comb filters) [illegible]
[illegible]
        //sim stereo, direct delay buffer
DWORD   AP1_r;
        //AP buffer read index
DWORD   AP2_r;
        //AP buffer read index
DWORD   SSt_r;
        //Sim Stereo buffer read index
DWORD   AP1_w;
        //AP buffer write index offset
DWORD   AP2_w;
        //AP buffer write index offset
DWORD   tarSSt_w;
        //target Sim Stereo buffer write index offset
Ipp32f  tarSStWidth;
        //target Sim stereo depth
Ipp32f  delSStWidth;
        //delta Sim stereo depth
Ipp32f  runSStWidth;
        //runtime Sim stereo depth
DWORD   delSSt_w;
        //delta Sim Stereo buffer write index offset
DWORD   runSSt_w;
        //runtime Sim Stereo buffer write index offset
DWORD   SStDD_w;
        //sim stereo, direct delay write offset
} DYNAMICCHANNELDATA, *PDYNAMICCHANNELDATA;
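
Several fields of the structure above come in target, delta, and runtime triples (for example tarDirectLevel, delDirectLevel and the corresponding runtime level, or tarSStWidth and delSStWidth), with the delta described as a transition graduated from one sample to the next. One simple way to realize such a triple is a per-sample linear ramp, sketched below. The linear ramp, the type LevelRamp, and the functions level_set_target and level_next are assumptions made for illustration; the Computer Listing Appendix may compute the delta differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target/delta/runtime triple for one smoothed parameter. */
typedef struct {
    float tar;  /* target value (new setpoint)             */
    float del;  /* per-sample increment toward the target  */
    float run;  /* value applied to the current sample     */
} LevelRamp;

/* Request a new target, to be reached in ramp_samples steps.
 * A ramp length of zero jumps to the target immediately. */
void level_set_target(LevelRamp *r, float target, size_t ramp_samples)
{
    assert(r);
    r->tar = target;
    if (ramp_samples == 0) {
        r->run = target;
        r->del = 0.0f;
    } else {
        r->del = (target - r->run) / (float)ramp_samples;
    }
}

/* Advance the runtime value by one sample and return it. The value is
 * clamped to the target so rounding cannot overshoot. */
float level_next(LevelRamp *r)
{
    assert(r);
    if (r->del != 0.0f) {
        r->run += r->del;
        if ((r->del > 0.0f && r->run >= r->tar) ||
            (r->del < 0.0f && r->run <= r->tar)) {
            r->run = r->tar;
            r->del = 0.0f;
        }
    }
    return r->run;
}
```

Multiplying each output sample by level_next avoids the audible click that an abrupt change of level would otherwise cause.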
The foregoing description is for the purpose of teaching those skilled in the art the best mode of
carrying out the invention and is to be construed as illustrative only. Numerous modifications
and alternative embodiments of the invention will be apparent to those skilled in the art in view
of this description, and the details of the disclosed structure may be varied substantially without
departing from the spirit of the invention. Accordingly, the exclusive use of all modifications
within the scope of the appended claims is reserved.



What is claimed and desired to be covered by a Letters Patent follows:

CLAIMS



What is claimed is: 

1. (Currently Amended). A synthesizer, comprising:

means for receiving an input audio stream representing an audio performance and
including a plurality of audio input samples at a first sample rate;

means for receiving data representing an impulse response that corresponds to an acoustic
effect; and

means for generating an output audio stream during a response time based on the input
audio stream and the impulse response by convolving the audio input samples with the data
representing the impulse response for a portion of the response time and modeling an output
audio system during the balance of the response time.

2. The synthesizer of claim 1, further comprising means for receiving from a user an 
indication of the acoustic effect. 

3. The synthesizer of claim 1, wherein the acoustic effect comprises an acoustic
modification of the audio performance.

(Cancelled)

4. (Currently Amended). The synthesizer of claim 1, wherein the input audio stream
comprises a plurality of audio input samples for each of a plurality of input channels.

5. (Currently Amended). The synthesizer of claim 1, wherein the output audio stream
includes a plurality of output channels.

6. (Currently Amended). The synthesizer of claim 1, wherein the acoustic effect
comprises acoustically simulating recording the audio performance using a particular
microphone.

7. (Currently Amended). The synthesizer of claim 1, wherein the acoustic effect
comprises acoustically simulating recording the audio performance using a particular
microphone placement.

8. (Currently Amended). The synthesizer of claim 1, wherein the acoustic effect
comprises acoustically simulating recording the audio performance in a particular musical
context.

9. (Currently Amended). The synthesizer of claim 1, wherein the acoustic effect
comprises acoustically simulating playing at least a portion of the audio performance using a
particular instrument body.

10. (Currently Amended). The synthesizer of claim 1, wherein the acoustic effect
comprises acoustically simulating playing at least a portion of the audio performance using a
particular instrument placement.

11. (Currently Amended). The synthesizer of claim 1, wherein the generating means 
comprises means for recursively extrapolating a tail portion of the output audio stream. 

12. (Currently Amended). The synthesizer of claim 1, wherein the audio performance
includes a first number of source channels, and wherein the output audio stream generated by the
generating means includes a second number of output channels greater than the first number of
source channels.

13. (Currently Amended). The synthesizer of claim 12, wherein the audio
performance includes only a single source channel and wherein the output audio stream
comprises a simulated stereo version of the single source channel.






14. (New). An acoustic synthesizer for synthesizing one or more acoustic effects, the
acoustic synthesizer comprising:

an input subsystem for receiving an input audio stream and storing said input audio 
stream in a predetermined file structure; and 

an acoustic synthesizer subsystem for emulating an acoustic effect and generating an 
output audio stream as a function of said input audio stream and said acoustic effect defined by 
one or more acoustic parameters, said acoustic parameters stored in said predetermined file 
structure. 

15. (New). The acoustic synthesizer as recited in claim 14, wherein said acoustic 
synthesizer subsystem includes a system for varying the acoustic effect and resulting output 
audio stream in real time. 

16. (New). The acoustic synthesizer as recited in claim 14, wherein said 
predetermined file structure includes a plurality of data fields. 

17. (New). The acoustic synthesizer as recited in claim 16, wherein said plurality
of data fields define a plurality of data types.

18. (New). The acoustic synthesizer as recited in claim 17, wherein at least one of 
said data types includes ambient environment data. 

19. (New). The acoustic synthesizer as recited in claim 16, wherein said plurality of
data fields define an instrument.

20. (New). The acoustic synthesizer as recited in claim 16, wherein said plurality of
data fields define a microphone type.

21. (New). The acoustic synthesizer as recited in claim 16, wherein said plurality of
data fields define a document.